Presentation to Ann Arbor R Users’ Group, Ann Arbor, MI
Center for Statistical Training and Consulting, Michigan State University
2024-11-14
Let’s define some concepts.
… is achieved when investigators share all the materials required to exactly recreate the findings so that others can verify them or conduct alternative analyses.
Repeatable
Reproducible
Replicable
Reproducible research (RR) is a product of how we work, not which methods we use.
Several forces are promoting and enabling the push toward reproducibility.
Irreproducible < Reproducible < Replicated
Important
Reproducibility is an attainable minimum standard for science[1].
Data sharing and reproducibility initiatives
Understand the criteria, then apply principles, practices, & tools.
| Materials | Findings |
|---|---|
| Manuals & procedures | Statistics |
| Instruments & scoring rules | Coefficients & p-values |
| Codebooks | Confidence intervals |
| Methods applied | Effect sizes |
| Data management decisions | Model fit indices |
| Data files | Figures |
| Software & analysis scripts | Tables |
“Captain, you’re asking me to work with equipment which is hardly very far ahead of stone knives and bearskins.”
Star Trek (1966) - S01E28 The City on the Edge of Forever
A compendium organizes digital files so others can review them, reproduce the results, or run new analyses.[13] It should:
… are folders of digital files designed for sharing code & help documentation.[14] They:
Important
Use an R package for a research compendium![13]
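As a rough illustration of the package-as-compendium idea, the sketch below builds a package-style folder layout with base R only. The name `mycompendium`, the folder names `analysis/` and `data-raw/`, and the DESCRIPTION contents are all hypothetical choices made for this example, not a prescribed layout; scaffolding packages can automate this step, but none is assumed here.

```r
# Hypothetical compendium name, created in a temporary directory for illustration
root <- file.path(tempdir(), "mycompendium")

# Package-style folders plus analysis-specific ones (folder names are assumptions)
dirs <- c("R", "man", "data-raw", "analysis", "inst")
for (d in dirs) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
}

# A DESCRIPTION file is what identifies the folder as an R package
description <- c(
  "Package: mycompendium",
  "Title: Research Compendium for a Hypothetical Study",
  "Version: 0.0.1",
  "Description: Data, code, and documentation needed to reproduce the results.",
  "License: MIT + file LICENSE"
)
writeLines(description, file.path(root, "DESCRIPTION"))

# Read the metadata back and list the top-level layout
pkg_name <- unname(read.dcf(file.path(root, "DESCRIPTION"))[1, "Package"])
created <- list.files(root)
print(created)
```

Keeping functions in `R/`, their help files in `man/`, and the study-specific scripts in a separate folder is one possible split; the point is only that the package conventions give collaborators a predictable place to look.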
… are folders of files being tracked by Git for version control purposes.[15,16] They:
Important
Put your research compendium in a Git repository!
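One way this could look from an R session is sketched below, using only base R and the `git` command-line program if it happens to be installed. The folder name `mycompendium-repo` and the `.gitignore` entries are hypothetical, and the sketch deliberately skips the first commit, which needs a configured Git user name and email.

```r
# Hypothetical compendium folder, created in a temporary directory for illustration
repo <- file.path(tempdir(), "mycompendium-repo")
dir.create(repo, recursive = TRUE, showWarnings = FALSE)

# Keep R session debris out of version control (common choices, not requirements)
writeLines(c(".Rhistory", ".RData", ".Rproj.user"), file.path(repo, ".gitignore"))

# Initialize the repository only if a git executable can be found on the PATH
git <- unname(Sys.which("git"))
has_git <- nzchar(git)
if (has_git) {
  # `git -C <path> init` runs `git init` inside the given folder
  status <- system2(git, c("-C", shQuote(repo), "init", "--quiet"))
  if (status != 0) warning("git init returned a non-zero exit status")
} else {
  message("git not found; the folder was created but is not yet a repository")
}
```

From there, the usual add and commit steps can be run from a terminal or a Git client inside the compendium folder.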
Treat RR as a form of collaboration with multiple stakeholders. Give them clear, useful, and relevant materials.
Who are your key collaborators?
Add diagram / photo illustrating collaboration with self, team-mates, and readers
What should you preserve?